11,668 results • Page 1 of 234
Hello, I have a list of ~1,300 single bp sites and a fully annotated genome. I'd like to create a fasta file with only the 1,300 sites (with ±300 bp on each side). My sites are in an Excel file right now with chromosome, position
updated 2 hours ago • Anita
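One possible approach to the question above, as a minimal sketch: it assumes the genome is already loaded into a dict mapping chromosome name to sequence, that positions are 1-based, and that the sites have been exported from Excel as (chromosome, position) pairs. The names `flank` and `sites_to_fasta` are hypothetical helpers, not part of any library.

```python
def flank(seq, pos, width=300):
    """Return (start, end, subsequence) for a 1-based site +/- width bp.

    The window is clipped at the sequence ends, so sites near a contig
    edge yield a shorter record instead of an error.
    """
    start = max(1, pos - width)
    end = min(len(seq), pos + width)
    return start, end, seq[start - 1:end]


def sites_to_fasta(genome, sites, width=300):
    """Build FASTA text with one record per (chromosome, position) site."""
    records = []
    for chrom, pos in sites:
        start, end, sub = flank(genome[chrom], pos, width)
        records.append(f">{chrom}:{start}-{end}\n{sub}")
    return "\n".join(records)


# Toy data standing in for the real genome and the exported site list.
genome = {"chr1": "ACGTACGTAC"}
print(sites_to_fasta(genome, [("chr1", 5)], width=2))
```

For a full genome, an indexed lookup (for example `samtools faidx` or `bedtools getfasta` on a BED file of the windows) would avoid holding every chromosome in memory.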
Hello everyone, I have a genome fasta file which has 16,941 sequences. Here is an example of my "genome.fasta": ``` >scf7180000026027 GAATGCATACTGCATCGATA >scf7180000026030 TGCCCAAGTTGTGAAGTGTC ``` I want to find identical sequences in this genome fasta file and return their IDs. My final goal is to find and remove any identical sequences present in my genome fasta file
updated 5 hours ago • Sony
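A minimal sketch of one way to approach the question above: group record IDs by their exact sequence and report every group with more than one member. It assumes "identical" means an exact, case-insensitive match (reverse complements are not considered), and `read_fasta` and `duplicate_groups` are hypothetical helper names.

```python
from collections import defaultdict


def read_fasta(lines):
    """Yield (id, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the record ID.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def duplicate_groups(records):
    """Return lists of IDs that share exactly the same sequence."""
    groups = defaultdict(list)
    for name, seq in records:
        groups[seq.upper()].append(name)
    return [ids for ids in groups.values() if len(ids) > 1]


# Toy input; for a real file, pass an open file handle to read_fasta.
example = """>scf1
GAATGC
>scf2
TGCCCA
>scf3
gaatgc
""".splitlines()
print(duplicate_groups(read_fasta(example)))
```

To remove duplicates, keep the first ID of each group and drop the rest when rewriting the file. For very large genomes, hashing each sequence (for example with `hashlib.sha256`) instead of using the full string as the dict key reduces memory use.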
I have a Trinity assembly file in fasta format. I want to annotate the Conus genome. There is limited storage on my server PC. Is there any way to do annotation
updated 6 hours ago • Asim Bin Arshad
In particular, I am searching for a schematic that illustrates each step of both pipelines from fasta to vcf/maf. This blogpost https://gatk.broadinstitute.org/hc/en-us/articles/9022487952155-Structural-variant-SV-discovery
updated 6 hours ago • Bioinformatics_begginner
and how to convert them into binary format. I have converted my files into tsv format. This is the header of my VCF file: #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S1_P-gDNA S1_P-cfDNA NORMAL cfDNA Could someone please help
updated 6 hours ago • sainavyav22
pca") # Read eigenvec and eigenval files eigenvec <- read.table("eigenvec", header = FALSE) eigenval <- read.table("eigenval", header = FALSE) # Assign column names to eigenvec colnames(eigenvec) <- c("SampleID
updated 13 hours ago • Ali
infile) # Default delimiter is comma writer = csv.writer(outfile) # Write header to output file header = next(reader) writer.writerow(header + ['Decimal Latitude', 'Decimal Longitude']) # Convert each row and
updated 3 days ago • kuttibiotech2009
errors. 3. Use PretextView and its features for manual curation. 4. How to obtain a genome curated fasta file using the Rapid Curation pipeline. 5. Become familiar with additional tools used to curate more challenging genomes
Hello Everyone. I am working with the sra data for whole exome sequence analysis. I am facing a problem regarding the sam file that I created after alignment. I am adding all the steps. **fastq-dump --split-files SRR1178899.sra** **fastqc *.fq** **bwa mem -t 12 -Y -L 0 -M -R "@RG\tID:sample\tSM:sample\tPL:Illumina" /mnt/nas/reference_genome/BWA/mammals/hg38/genome.fa R1_step1.fq R2_step1.fq &a…
updated 4 days ago • saifulislam99121
ln /usr/bin/cat bingo run cat test.txt bingo r cat test.txt # or run it d bingo cat test.txt ``` ### rename an executable file ```bash bingo mv <old_name> <new_name> ``` ### delete an executable file only files in `$HOME/.bingo/bin` can be removed
updated 4 days ago • dwpeng
the lexical analyzer generator; the darwin and xopen defines are # workarounds for some macOS 12 header file issues; e.g.: # sff.c:1615:19: error: implicitly declaring library function 'strdup' with type 'char *(const char *) # see also
updated 5 days ago • Rodolfo Adrián
input_file /opt/vep/files/${inputVcf_file} \ --output_file /opt/vep/files/${output_file} \ --fasta /opt/vep/.vep/custom/references/Homo_sapiens_assembly38.fasta \ --allele_number \ --individual all \ --per_gene Why is this
read names and insert sizes" lists an array of numbers per read pair, that does not resemble the headers (#id,numericID,insert,status,mismatches). Shown below: **This is the --outinsert:** #id numericID insert status mismatches...sized paired fastq files as separate runs, and then a shorter paired fastq file where we changed the headers. **This is what prints to screen for the…
updated 5 days ago • chrisk
Hello, I run kallisto on my data and I am in the process of assigning gene names to my data. I tried to do this in 2 different ways but I get different results. The first way I tried is shown below using the t2g.py from https://github.com/pachterlab/kallisto-transcriptome-indices/releases: #Create the transcripts_to_genes file python t2g.py --use_version <homo_sapiens.grch38…
updated 6 days ago • bioinfo
I have previously used the BioMart web portal to download fastas for the 3' UTRs of a list of Ensembl stable gene IDs. Typically I limit my output to "MANE Select" as I am trying to get just one
updated 6 days ago • RNAseqer
I am seeking help with Augustus gene prediction! I am performing a whole genome assembly of a plant species. I have completed the gene prediction using the Augustus pipeline. The output file is in `.gff` format. Now I want to perform the gene annotation by running `BLAST`, for which I need the coding sequences in a `.fasta` file. This is the method that I've thought of. …
updated 7 days ago • Vijith
files only exist with UCSC chromosome nomenclature, but not for Ensembl. I know there are ways to rename these files, but since they have so many non-standard contigs, I have the feeling that might get a little messy. So, my current...since my BAM files are also trimmed to the CDS of some genes that are all on the main chromosomes. Renaming these is straightforward. However, I don't know, if M…
updated 7 days ago • gernophil
contigs [M::process] read 296298 sequences (20000115 bp)... [main_samview] fail to read the header from "-". [W::hts_set_opt] Cannot change block size for this format samtools sort: failed to read header from "-" Your insights
updated 7 days ago • Vahid
Hi, I am looking for a fasta file that contains mouse rRNA sequences, but I noticed that the links I searched on the internet point to some different
updated 7 days ago • octpus616
W::bcf_hrec_check] Invalid tag name: "1000gALT" [W::vcf_parse_info] INFO '.' is not defined in the header, assuming Type=String [W::bcf_hrec_check] Invalid tag name: "." Error encountered while parsing the input at 1:121387974...W::bcf_hrec_check] Invalid tag name: "1000gALT" [W::vcf_parse_info] INFO '.' is not defined in the header, assuming Type=String [W::bcf_hrec_check] Invalid tag name:…
updated 8 days ago • Matteo Ungaro
Hello everyone, I'm new to RStudio, and I'm a little bit stuck. I'm trying to run the CIBERSORT code for the deconvolution of RNAseq samples using the LM22 signature matrix provided. I did a previous differential analysis with DESeq2, and used the normalized matrix from my analysis to run the CIBERSORT script. Here is my code: `if (!require(CIBERSORT))devtools::install_github("Moonerss/CIBERSO…
updated 9 days ago • Azra
longest ORF in that identified sequence? Identify all repeats in a sequence for all sequences in the FASTA, along with how many times each repeat occurs and which is the most frequent repeat.” The primary problem I think I have...is that I don’t know how to reference the sequences inside a FASTA file beyond what I have already, so my has_codon section of code isn’t working like I think it should…
updated 11 days ago • cput
broker_name,sample_title,nominal_sdev,first_created&format=tsv&download=true&limit=0 headers = {"User-Agent": generate_user_agent()} download = s.get(url, headers=headers, allow_redirects=True) with open((os.path.join
updated 11 days ago • Giulia
score-client view --object-id 28358cf3-fba0-51a3-8b93-104bd5d48b23 --reference-file /home/victor/ref-fasta/GRCh38_full_analysis_set_plus_decoy_hla.fa --output-dir /media/victor/c1d5c312-b546-4d5e-b24f-72dbe9e6f18f/javier_CPTAC...per_patien/test However, it only gives me the header and as a SAM file. Has anyone used the query option and obtained the correct results
updated 11 days ago • Javier
the issue but still I get 1.3x greater than hisat. code below: ``` # Run HISAT2 ....... # extract header from bam and save to sam file ….. #extract uniquely concordant reads samtools view sample-sorted.bam | \ awk 'BEGIN{FS="\t";OFS...if ($NF=="NH:i:1" && $(NF-2)=="YT:Z:CP"){print $0}}' > \ sample-for-subread.sam # merge header and sam file above …. # sam to bam …. #s…
updated 12 days ago • Prawesh
paired-end reads for a single plant sample that I have assembled using megahit, resulting in a FASTA file of contigs. This will act as my "reference genome". File 2: FASTA file of contigs generated from de novo assembly of ddRADseq
updated 12 days ago • Lemonhope
Hi, I wonder how samtools consensus works without explicitly specifying the reference genome. If I want to supply a reference genome when generating the consensus sequence, is that possible with samtools? Thanks a lot. Reference: https://www.htslib.org/doc/samtools-consensus.html
updated 12 days ago • me
CPU sec, 40.282 real sec [E::sam_hrecs_update_hashes] Duplicate entry "scf7180000010076" in sam header samtools view: failed to add PG line to the header And this is the command that I ran for mapping: bwa mem -t 8 -M -R '@RG\tID:SAMPLE_PE...ERR3890922.sam I am currently using samtools 1.19.2 and BWA 0.7.17 I don't understand why the SAM header has "Duplicate entry" and what sh…
updated 13 days ago • Sony
Hi, I am trying to do some differential expression experiments on my bacteria strain and I am very new to the field. I aligned my (paired-end) reads with STAR to both a genome and plasmid (using 2 separate fasta files + 1 combined gff file, which was checked for identical annotation format). Afterwards I used featureCounts, but unfortunately…
updated 13 days ago • heelpPlease
singularity exec vg.1.52.sif vg autoindex --workflow map --prefix AllRefGraph --ref-fasta Ref1.fasta Ref2.fasta Ref3.fasta Ref4.fasta Ref5.fasta Ref6.fasta Ref7.fasta Ref8.fasta Ref9.fasta Ref10.fasta
updated 14 days ago • sarumonsus
string_api_url = "https://version-11-5.string-db.org/api" output_format = "tsv-no-header" method = "interaction_partners" my_proteins = proteins['protein']) # Construct the request request_url = "/".join([string_api_url
updated 15 days ago • brandon
WBP4/gene_expression", countFiles[i]) counts <- read.table(countFilePath, header = FALSE, col.names = c("gene", sampleNames[i])) countDataList[[i]] <- counts } # Merge all count data into a single data frame by gene
updated 16 days ago • adi.gershon1
Hi all, I am trying to visualize the result of gene set enrichment analysis. This is my plot and the code in R. Is there any way to change the code so that the text (names of gene sets) is sorted as in the example plot? Also, I want a square border around the plot. Here is my code: data <- read.csv("GSEA_visualize.csv", header = TRUE, sep = ",") # Load required librar…
updated 16 days ago • Rob
org.Hs.eg.db) library(dplyr) library(edgeR) mat<-read.table("~/Downloads/BRCA_exp_matrix.tsv",header=TRUE,sep="\t",fill=TRUE) library(readr) clinical <- read.table("~/Downloads/clinical_info_TCGA-BRCA.tsv", sep = "\t", na.strings
updated 16 days ago • Natali
can't handle the BAM files due to memory issues). I considered using bcftools consensus to generate FASTA files from the VCFs, but HLA typing software requires reads (FASTQ or BAM files), and I haven't found a way to obtain those
f"/*.sra ./ ; done < directories.txt # convert all the files in frw and rev fasta formats: fastq-dump --split-files *.sra
updated 18 days ago • Begonia_pavonina
files of which the file named `file.fasta.masked` is of the same size as the original input fasta file, another file named `file.fasta.out` is of ~700mb, and a third file named `file.fasta.tbl` . I understand that `file.fasta.masked
are from different plasmid families (Rep types). I have exported each individual plasmid as a unique fasta file. I was just wondering if there is a way to assess genetic relatedness (and visibly display) between these plasmid
updated 19 days ago • nicole.kavanagh
I generated a pssm file from psi-blast and then I am using POSSUM to generate a pse-pssm file to run a programme, ASPIRER, for identifying unconventionally secreted proteins. However, I am running into issues with it. The code I used to generate the pssm file is as follows: ``` psiblast \ -db nr \ -query /nesi/project/vuw03925/software/POSSUM_Standalone_Toolkit/input/test_lottia.fasta \ -nu…
updated 19 days ago • rianna.collins
I am currently trying to create a pssm from FASTA using ncbi-blast-2.2.30+-x64-linux. I will use the [UniProt][1] and [UniRef90][2] databases for this purpose. What will be the minimum requirements
updated 20 days ago • Nafi
I have a large fasta file of a new species, and I want to find and extract a particular protein sequence. I also know a protein sequence of a similar species
updated 20 days ago • anna
I am unable to get variance in the feature count file. I think there is a problem with the reference genome file and GTF file. I downloaded them from NCBI, but I didn't get satisfactory results. Could anyone please suggest what the error might be?
based analysis of transcriptomic data in Galaxy Australia. I have downloaded the reference genome FASTA file and GTF file for Arabidopsis thaliana from NCBI. I have successfully mapped the raw reads to the reference genome
updated 20 days ago • Ravita
Hello everyone! Is there a way to introduce specific SNPs on the fasta sequence of a gene? I am working on a Pharmacogenetics project and I want to simulate reads for specific haplotypes of...PGX genes. T…
1.0" encoding="UTF-8"?> <!DOCTYPE Query> <query count="" datasetconfigversion="0.6" formatter="TSV" header="0" uniquerows="0" virtualschemaname="plants_mart"><dataset interface="default" name="athaliana_eg_gene"><attribute name…
updated 20 days ago • Dora
research paper. I have seen that many PDB entries have fewer residues in a protein chain than the full FASTA sequence. Most likely, the cause is that those residues were left unmodeled because they could not be resolved during the crystallization phase...1ZM1][2]. Here in chain B, the last couple of residues were unmodeled. Should I use the shortened FASTA from pdb or should I use the full FASTA for my dataset? [1…
updated 21 days ago • Nafi
edit feature (annotation) data annotLookup <- read.table( annotfile, header = FALSE, sep = '\t', stringsAsFactors = FALSE, comment.char = "#", fill = TRUE) colnames(annotLookup) <- annotLookup[2,] colnames(annotLookup
updated 24 days ago • hagl
the aligned haplotig reads as well as the chromosome it mapped to, and where it aligns in the header of the fastq so I can do chromosome level analysis on the haplotype assembly. How can I create this fastq? The same haplotig
updated 24 days ago • turcoa1
I'm trying to rename my clusters in a `Seurat` object. My old cluster IDs are numbers ```r Idents(seuObj) <- 'RNA_snn_res.0.1' levels(seuObj) [1] "0" "1" "2
updated 25 days ago • Assa Yeroslaviz
Hi, I have a list of rsIDs and I want to search them against the ClinVar database and print the condition_germline column for each rsID. I have a script: ``` use strict; use warnings; use LWP::Simple; use HTML::TableExtract; # Read list of rsids from file my $rsids_file = 'rsids.txt'; open(my $fh, '<', $rsids_file) or die "Can't open $rsids_file: $!"; my @rsids = <$…
updated 25 days ago • ashaneev07